



# CS305: Computer Architecture

Dynamic Instruction Scheduling

https://www.cse.iitb.ac.in/~biswa/courses/CS305/main.html

## Recap

Remember out-of-order processor

Inorder fetch, out-of-order execute, inorder commit

Now, we will discuss in details

Static scheduling: Compiler can do; loop unrolling etc.

# Tomasulo's Organization for dynamic



### Points to remember

- Control & buffers <u>distributed</u> with Function Units (FU)
  - FU buffers called "reservation stations"; have pending operands
- Registers in instructions replaced by values or pointers to reservation stations(RS); called register renaming;
  - avoids WAR, WAW hazards
  - More reservation stations than registers, so can do optimizations compilers can't
- Results to FU from RS over <u>Common Data Bus</u> that broadcasts results to all FUs
- Load and Stores treated as FUs with RSs as wells
- Decode stage of the pipeline: becomes two stages:

Issue: Decode instructions, check structural hazards

Read operands: Wait until no data hazards, then read operands.

Some processors use the term dispatch and issue.

## Reservation Station Components

Op: Operation to perform in the unit (e.g., + or −)

Vj, Vk: Value of Source operands

Store buffers has V field, result to be stored

Qj, Qk: Reservation stations producing source registers (value to be written)

- -Qj,Qk=0 => ready
- Store buffers only have Qi for RS producing result

Busy: Indicates reservation station or FU is busy

Register result status—Indicates which functional unit will write each register, if one exists. Blank when no pending instructions that will write that register.

# Three Stages with Tomasulo 1. Issue—get instruction from FP Op/instruction Queue

If reservation station free (no structural hazard), control issues instr & sends operands (renames registers).

2. Execution—operate on operands (EX)

When both operands ready then execute; if not ready, watch Common Data Bus for result

3. Write result—finish execution (WB)

Write on Common Data Bus to all awaiting units; mark reservation station available

- Normal data bus: data + destination ("go to" bus)
- <u>Common data bus</u>: data + <u>source</u> ("<u>come from</u>" bus)
  - 64 bits of data + 4 bits of Functional Unit source address
  - Write if matches expected Functional Unit (produces result)
  - Does the broadcast

## The New Pipeline

**Inorder** Instruction fetch

Fetched instructions enqueued into a Q called Instruction Q (IQ).

Inorder instruction issue from the IQ

**Outoforder** execution

New concept of register renaming through reservation stations that eliminates WAR and WAW hazards

### An Example

0

FU





Load: 2 cycle FP add: 2 cycles

FP multiply: 10 cycles

FP divide: 40 cycles



#### Register result status:

Clock F0 F2 F4 F6 F8 F10 F12 ... F30
2 FU Load2 Load1

#### can have multiple loads outstanding



#### Register result status:

Clock F2 F4 F6 F8 F10 F12 ... F30

Mult1 Load2 Load1

· Note: registers names are removed ("renamed") in Reservation Stations; MULT issued



#### Register result status:

Clock F0 F2 F4 F6 F8 F10 F12 ... F30 FU Mult1 Load2 M(A1) Add1

Load2 completing; what is waiting for Load2?
 Computer Architecture



| Instruction | n sta | tus:   |                  |       | Exec  | Write      |       |       |      |         |
|-------------|-------|--------|------------------|-------|-------|------------|-------|-------|------|---------|
| Instruction | on    | j      | $\boldsymbol{k}$ | Issue | Comp  | Result     |       |       | Busy | Address |
| LD          | F6    | 34+    | R2               | 1     | 3     | 4          |       | Load1 | No   |         |
| LD          | F2    | 45+    | <b>R</b> 3       | 2     | 4     | 5          |       | Load2 | No   |         |
| MULTD       | FO    | F2     | F4               | 3     |       |            |       | Load3 | No   |         |
| SUBD        | F8    | F6     | F2               | 4     |       |            |       |       |      |         |
| DIVD        | F10   | FO     | F6               | 5     |       |            |       |       |      |         |
| ADDD        | F6    | F8     | F2               | 6     |       |            |       |       |      |         |
| Reservatio  | on St | ations | <b>5:</b>        |       | S1    | <i>S</i> 2 | RS    | RS    |      |         |
|             | Time  | Name   | Busy             | Op    | Vj    | Vk         | Qj    | Qk    |      |         |
|             | 1     | Add1   | Yes              | SUBD  | M(A1) | M(A2)      |       |       |      |         |
|             |       | Add2   | Yes              | ADDD  |       | M(A2)      | Add1  |       |      |         |
|             |       | Add3   | No               |       |       |            |       |       |      |         |
|             | 9     | Mult1  | Yes              | MULTD | M(A2) | R(F4)      |       |       |      |         |
|             |       | Mult2  | Yes              | DIVD  |       | M(A1)      | Mult1 |       |      |         |

#### Register result status:

| Clock |    | FO    | F2    | F4 | <i>F6</i> | F8   | F10   | <i>F12</i> | ••• | F30 |
|-------|----|-------|-------|----|-----------|------|-------|------------|-----|-----|
| 6     | FU | Mult1 | M(A2) |    | Add2      | Add1 | Mult2 |            |     |     |

#### · Issue ADDD here



#### Register result status:

Clock F0 F2 F4 F6 F8 F10 F12 ... F30 FU Mult1 M(A2) Add2 Add1 Mult2

· Add1 completing; what is waiting for it?

| Instructio  | n sta         | tus:   |            |       | Exec  | Write      |       |       |      |         |
|-------------|---------------|--------|------------|-------|-------|------------|-------|-------|------|---------|
| Instruction | Instruction j |        |            | Issue | Comp  | Result     |       |       | Busy | Address |
| LD          | F6            | 34+    | R2         | 1     | 3     | 4          |       | Load1 | No   |         |
| LD          | F2            | 45+    | R3         | 2     | 4     | 5          |       | Load2 | No   |         |
| MULTD       | FO            | F2     | F4         | 3     |       |            |       | Load3 | No   |         |
| SUBD        | F8            | F6     | F2         | 4     | 7     | 8          |       |       |      |         |
| DIVD        | F10           | FO     | F6         | 5     |       |            |       |       |      |         |
| ADDD        | F6            | F8     | F2         | 6     |       |            |       |       |      |         |
| Reservatio  | on St         | ations | <b>5</b> : |       | S1    | <i>S</i> 2 | RS    | RS    |      |         |
|             | Time          | Name   | Busy       | Op    | Vj    | Vk         | Qj    | Qk    |      |         |
|             |               | Add1   | No         |       |       |            |       |       |      |         |
|             | 2             | Add2   | Yes        | ADDD  | (M-M) | M(A2)      |       |       |      |         |
|             |               | Add3   | No         |       |       |            |       |       |      |         |
|             | 7             | Mult1  | Yes        | MULTD | M(A2) | R(F4)      |       |       |      |         |
|             |               | Mult2  | Yes        | DIVD  |       | M(A1)      | Mult1 |       | ]    |         |

#### Register result status:

| Clock |    | F0    | F2    | <i>F4</i> | <i>F6</i> | F8    | F10   | <i>F12</i> | ••• | F30 |
|-------|----|-------|-------|-----------|-----------|-------|-------|------------|-----|-----|
| 8     | FU | Mult1 | M(A2) |           | Add2      | (M-M) | Mult2 |            |     |     |

| Instructio    | n sta | tus:   |                  |       | Exec  | Write      |       |       |      |         |
|---------------|-------|--------|------------------|-------|-------|------------|-------|-------|------|---------|
| Instruction j |       |        | $\boldsymbol{k}$ | Issue | Comp  | Result     |       |       | Busy | Address |
| LD            | F6    | 34+    | R2               | 1     | 3     | 4          |       | Load1 | No   |         |
| LD            | F2    | 45+    | <b>R</b> 3       | 2     | 4     | 5          |       | Load2 | No   |         |
| MULTD         | F0    | F2     | F4               | 3     |       |            |       | Load3 | No   |         |
| SUBD          | F8    | F6     | F2               | 4     | 7     | 8          |       |       |      |         |
| DIVD          | F10   | FO     | F6               | 5     |       |            |       |       |      |         |
| ADDD          | F6    | F8     | F2               | 6     |       |            |       |       |      |         |
| Reservatio    | on St | ations | s:               |       | S1    | <i>S</i> 2 | RS    | RS    |      |         |
|               | Time  | Name   | Busy             | Op    | Vj    | Vk         | Qj    | Qk    |      |         |
|               |       | Add1   | No               |       |       |            |       |       | ]    |         |
|               | 1     | Add2   | Yes              | ADDD  | (M-M) | M(A2)      |       |       |      |         |
|               |       | Add3   | No               |       |       |            |       |       |      |         |
|               | 6     | Mult1  | Yes              | MULTD | M(A2) | R(F4)      |       |       |      |         |
|               |       | Mult2  | Yes              | DIVD  |       | M(A1)      | Mult1 |       |      |         |

#### Register result status:

| In            | structio  | n stai | tus:             |            |       | Exec      | Write      |       |       |         |  |
|---------------|-----------|--------|------------------|------------|-------|-----------|------------|-------|-------|---------|--|
| Instruction j |           |        | $\boldsymbol{k}$ | Issue      | Comp  | Result    |            |       | Busy  | Address |  |
|               | LD        | F6     | 34+              | R2         | 1     | 3         | 4          |       | Load1 | No      |  |
|               | LD        | F2     | 45+              | <b>R</b> 3 | 2     | 4         | 5          |       | Load2 | No      |  |
|               | MULTD     | FO     | F2               | F4         | 3     |           |            |       | Load3 | No      |  |
|               | SUBD      | F8     | F6               | F2         | 4     | 7         | 8          |       |       |         |  |
|               | DIVD      | F10    | FO               | F6         | 5     |           |            |       |       |         |  |
|               | ADDD      | F6     | F8               | F2         | 6     | 10        |            |       |       |         |  |
| Re            | eservatio | on St  | ations           | s:         |       | <i>S1</i> | <i>S</i> 2 | RS    | RS    |         |  |
|               |           | Time   | Name             | Busy       | Op    | Vj        | Vk         | Qj    | Qk    |         |  |
|               |           |        | Add1             | No         |       |           |            |       |       |         |  |
|               |           | 0      | Add2             | Yes        | ADDD  | (M-M)     | M(A2)      |       |       |         |  |
|               |           |        | Add3             | No         |       |           |            |       |       |         |  |
|               |           | 5      | Mult1            | Yes        | MULTE | M(A2)     | R(F4)      |       |       |         |  |
|               |           |        | Mult2            | Yes        | DIVD  |           | M(A1)      | Mult1 |       |         |  |
|               |           |        |                  |            |       |           |            |       |       |         |  |

#### Register result status:

· Add2 completing; what is waiting for it?

```
Instruction status:
                                  Exec Write
                                  Comp Result
   Instruction
                                                            Busy Address
                        k
                            Issue
   LD
            F6
                 34 +
                       R2
                                                             No
                                                     Load1
            F2
                 45+
                       R3
   LD
                                                     Load2
                                                             No
   MULTD
            F0
                  F2
                       F4
                                                     Load3
                                                             No
   SUBD
                  F6
                                           8
   DIVD
            F10
                  F0
                       F6
                  F8
                       F2
   ADDD
            F6
                                    10
                                          11
Reservation Stations:
                                    SI
                                          S2
                                                RS
                                                      RS
           Time Name Busy
                                          Vk
                             Op
                                                Qj
                                                      Qk
                Add1
                       No
                Add2
                       No
                Add3
                       No
                       Yes MULTD M(A2) R(F4)
               4 Mult1
                Mult2
                       Yes
                            DIVD
                                         M(A1) Mult1
```

#### Register result status:

| Clock |    | F0    | F2    | <i>F4</i> | <i>F6</i>            | F8    | F10   | <i>F12</i> | ••• | F30 |
|-------|----|-------|-------|-----------|----------------------|-------|-------|------------|-----|-----|
| 11    | FU | Mult1 | M(A2) |           | $\overline{(M-M+N)}$ | (M-M) | Mult2 |            |     |     |

- · Write result of ADDD here.
- All quick instructions complete in this cycle! Computer Architecture



```
Instruction status:
                                  Exec Write
                                  Comp Result
                                                           Busy Address
   Instruction
                       \boldsymbol{k}
                            Issue
   LD
                                                             No
            F6
                 34 +
                       R2
                                    3
                                                     Load1
                                           4
   LD
                 45+
                       R3
            F2
                                           5
                                                     Load2
                                                             No
   MULTD
            F0
                                                     Load3
                 F2
                       F4
                                                             No
   SUBD
            F8
                 F6
                       F2
                                    7
                                           8
   DIVD
            F10
                 F0
                       F6
   ADDD
            F6
                  F8
                       F2
                                    10
                                          11
                              6
Reservation Stations:
                                   S1
                                          S2
                                                RS
                                                      RS
                                    Vj
                                          Vk
           Time Name Busy
                             Op
                                                Qj
                                                      Qk
                Add1
                       No
                Add2
                       No
                Add3
                       No
                       Yes MULTD M(A2) R(F4)
              2 Mult1
                Mult2
                      Yes
                            DIVD
                                        M(A1) Mult1
Register result status:
                                                                   F12
   Clock
                             F0
                                   F2
                                         F4
                                                F6
                                                      F8
                                                            F10
                                                                               F30
                                  M(A2)
     13
                       FU
                            Mult1
                                             (M-M+N(M-M) Mult2
```





**16** 

```
Instruction status:
                                 Exec Write
                                 Comp Result
                                                         Busy Address
   Instruction
                       k
                           Issue
  LD
            F6
                34 +
                      R2
                                   3
                                                   Load1
                                                           No
                                         4
  LD
                45+
                      R3
            F2
                                                   Load2
                                                           No
   MULTD
            F0
                      F4
                                                   Load3
                 F2
                                   15
                                                           No
                                         16
   SUBD
           F8
                 F6
                      F2
                                         8
   DIVD
           F10
                 F0
                      F6
   ADDD
            F6
                 F8
                      F2
                             6
                                   10
                                         11
Reservation Stations:
                                  S1
                                        S2
                                              RS
                                                     RS
                                   Vj
                                         Vk
          Time Name Busy
                            Op
                                               Qj
                                                     Qk
               Add1
                      No
               Add2
                      No
               Add3
                      No
               Mult1
                      No
             40 Mult2
                      Yes
                           DIVD
                                 M*F4 M(A1)
Register result status:
                                                    F8
   Clock
                            F0
                                  F2
                                        F4
                                              F6
                                                          F10
                                                                 F12
                                                                             F30
```

M\*F4

FU

M(A2)

#### **Computer Architecture**

Mult2

(M-M+N.(M-M))





#### Register result status:

· Mult2 is completing; what is waiting for it?



#### Register result status:

• Once again: In-order issue, out-of-order execution and completion. Please note we have not introduced branch prediction, speculation, exception into this scheduling yet. So out of order completion is ok so far.

## Register Renaming

- Tomasulo provides *Implicit Register Renaming* 
  - User registers renamed to reservation station tags
- Explicit Register Renaming:
  - Use physical register file that is larger than number of registers specified by ISA
- Keep a translation table:
  - ISA register => physical register mapping
  - Physical register becomes free when not being used by any instructions in progress. More later after ROB.

## Explicit Register Renaming

- Rapid access to a table of translations
- A physical register file that has more registers than specified by the ISA
- Ability to figure out which physical registers are free.
  - -No free registers  $\Rightarrow$  stall on issue
- Thus, register renaming doesn't require reservation stations.
   However:
  - Many modern architectures use explicit register renaming +
     Tomasulo-like reservation stations to control execution.

## ευχαριστώ